High-Throughput Coherence Controllers

Authors

  • Ashwini K. Nanda
  • Anthony-Trung Nguyen
  • Maged M. Michael
  • Douglas J. Joseph
Abstract

Recent research shows that coherence controller occupancy is a major performance bottleneck for distributed cache-coherent shared-memory multiprocessors. In this paper we study three approaches to alleviating this problem in hardwired coherence controllers: multiple protocol engines, pipelined protocol engines, and split request-response streams. The split request-response stream mechanism is a novel contribution of this paper. The performance of pipelining in the context of coherence controllers has not previously been presented in the literature. Multiple protocol engines have been studied in the context of hardwired controllers only in one earlier study of ours, and only to a limited extent. Using both commercial and scientific benchmarks on detailed simulation models, we present experimental results showing that each mechanism is highly effective, reducing controller occupancy by as much as 66% and improving execution time by as much as 51% for applications with high communication bandwidth requirements. A combination of mechanisms further reduces controller occupancy and execution time, by as much as 78% and 61%, respectively. Our results show that applying any of the parallel mechanisms in the coherence controllers allows integrating four times as many processors per coherence controller, thus reducing system cost, while maintaining or even exceeding the performance of systems with a larger number of coher-

Similar resources

Liquid crystal tunable filters and polarization controllers for biomedical optical imaging

Liquid crystal (LC) devices exhibit fast and strong tuning and switching capabilities using small voltages and can be miniaturized thus have a great potential to be used with miniature optical imaging systems for biomedical applications. LC devices designed specifically for integration into biomedical optical imaging systems are presented. Using a combination of one or two LC retarders we obtai...

Using In-flight Chains to Build a Scalable Cache Coherence Protocol

Samantika Subramaniam (Intel Corporation), Simon C. Steely (Intel Corporation), Will Hasenplaugh (Intel Corporation and MIT)

As microprocessor designs integrate more cores, scalability of cache coherence protocols becomes a challenging problem. Most directory-based protocols avoid races by using blocking tag-directories which can impact the performance of parallel applications. In this paper we first quantitatively demonstrate that state-of-the-art blocking protocols significantly constrain throughput at large core c...

Coherence Controller Architectures for Scalable Shared-Memory Multiprocessors

Scalable distributed shared-memory architectures rely on coherence controllers on each processing node to synthesize cache-coherent shared memory across the entire machine. The coherence controllers execute coherence protocol handlers that may be hardwired in custom hardware or programmed in a protocol processor within each coherence controller. Although custom hardware runs faster, a protocol...

High-speed high-resolution plasma spectroscopy using spatial-multiplex coherence imaging techniques

We have recently obtained the first simultaneous 2-d plasma Doppler spectroscopic images of plasma brightness, temperature and flow fields. Using compact polarization optical methods, quadrature images of the optical coherence of an isolated spectral line are multiplexed to four quadrants of a fast CCD camera. The simultaneously captured, but distinct images can be simply processed to unfold th...

Fusion Coherence: Scalable Cache Coherence for Heterogeneous Kilo-Core System

Future heterogeneous systems will integrate CPUs and GPUs on a single chip to achieve high computing performance as well as high throughput. In general, such systems will discard the current discrete pattern and build a unified shared memory system that avoids explicit data movement among CPUs and GPUs connected by a high-throughput NoC. We propose a scalable cache coherence solution, Fusion Coherence ...

Publication date: 2000